parametric rate




Reviews: Minimax Estimation of Maximum Mean Discrepancy with Radial Kernels

Neural Information Processing Systems

The paper is very interesting, as it provides the first known lower bounds on the minimax rate of estimation of the MMD distance between two distributions from finite random samples, for general radial kernels; these lower bounds match the known upper bounds up to constants. The bounds yield a classical parametric rate of estimation that does not depend on the dimension of the data. The text could, however, be more fluent and easier to read for non-specialists in minimax estimation theory. For instance, the results are all expressed in terms of minimax probabilities. Could the link with the more classical minimax risk be stated explicitly?
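For readers less familiar with MMD estimation, here is a minimal sketch of the standard unbiased (U-statistic) estimator of the squared MMD with a Gaussian radial kernel, whose dimension-free parametric rate is the object of the reviewed bounds; the bandwidth and sample sizes below are illustrative choices, not from the paper:

```python
import numpy as np

def mmd2_unbiased(X, Y, sigma=1.0):
    """Unbiased U-statistic estimate of the squared MMD between samples X and Y,
    using the Gaussian radial kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
    The bandwidth sigma is an illustrative choice."""
    def gram(A, B):
        # Pairwise squared distances, then the radial kernel.
        d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
        return np.exp(-d2 / (2 * sigma**2))

    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
    # Drop diagonal terms so the within-sample averages are unbiased.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(500, 2))
Y = rng.normal(0.0, 1.0, size=(500, 2))   # same distribution: estimate near 0
Z = Y + 2.0                               # shifted distribution: estimate clearly positive
```

Averaged over samples of size $n$, the error of this statistic shrinks at the $n^{-1/2}$ rate regardless of the data dimension — the parametric behavior that the paper's matching lower bounds show is unimprovable.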


Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations

Holzmann, Hajo, Meister, Alexander

arXiv.org Machine Learning

Expected values weighted by the inverse of a multivariate density or, equivalently, Lebesgue integrals of regression functions with multivariate regressors occur in various areas of application, including estimation of average treatment effects, nonparametric estimation in random coefficient regression models, and deconvolution in Berkson errors-in-variables models. The frequently used nearest-neighbor and matching estimators suffer from bias problems in multiple dimensions. By using polynomial least squares fits on each cell of the $K^{\text{th}}$-order Voronoi tessellation for sufficiently large $K$, we develop novel modifications of nearest-neighbor and matching estimators which again converge at the parametric $\sqrt{n}$-rate under mild smoothness assumptions on the unknown regression function and without any smoothness conditions on the unknown density of the covariates. We stress that, in contrast to competing methods for correcting the bias of matching estimators, our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters. We complement the upper bounds with appropriate lower bounds derived from information-theoretic arguments, which show that some smoothness of the regression function is indeed required to achieve the parametric rate. Simulations illustrate the practical feasibility of the proposed methods.
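The one-dimensional special case of the classical first-order nearest-neighbor estimator of $\int m(x)\,dx = E[Y/f(X)]$ can be sketched as follows; this is the simple estimator the paper modifies (its $K$-th order Voronoi construction with polynomial fits is more involved), and the regression function and noise level are illustrative:

```python
import numpy as np

def nn_integral_estimate(x, y, lo=0.0, hi=1.0):
    """Classical 1-d nearest-neighbor estimator of the Lebesgue integral
    int m(x) dx = E[Y / f(X)]: each response is weighted by the length of
    its (first-order) Voronoi cell, which approximates 1 / (n f(X_i))."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    # Voronoi cell boundaries in 1-d: midpoints between consecutive design
    # points, clipped to the domain [lo, hi].
    edges = np.concatenate(([lo], (xs[:-1] + xs[1:]) / 2, [hi]))
    cell_lengths = np.diff(edges)
    return float(np.sum(ys * cell_lengths))

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 2000)
y = x**2 + rng.normal(0.0, 0.05, 2000)   # m(x) = x^2, so the target integral is 1/3
```

In one dimension this simple cell-length weighting already performs well; the bias problems the abstract refers to arise in multiple dimensions, which is where the paper's higher-order Voronoi cells with local polynomial fits restore the $\sqrt{n}$-rate.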


Exploiting Observation Bias to Improve Matrix Completion

Mann, Sean, Park, Charlotte, Shah, Devavrat

arXiv.org Artificial Intelligence

We consider a variant of matrix completion where entries are revealed in a biased manner, adopting a model akin to that introduced by Ma and Chen. Instead of treating this observation bias as a disadvantage, as is typically the case, our goal is to exploit the shared information between the bias and the outcome of interest to improve predictions. Towards this, we propose a simple two-stage algorithm: (i) interpreting the observation pattern as a fully observed noisy matrix, we apply traditional matrix completion methods to it to estimate the distances between the latent factors; (ii) we apply supervised learning on the recovered features to impute missing observations. We establish finite-sample error rates that are competitive with the corresponding supervised learning parametric rates, suggesting that our learning performance is comparable to having access to the unobserved covariates. Empirical evaluation on a real-world dataset reflects similar performance gains, with our algorithm's estimates having 30x smaller mean squared error than those of traditional matrix completion methods.
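The two-stage idea can be sketched under strong simplifying assumptions: a rank-1 observation bias, a truncated SVD in place of a full matrix completion method, and plain least squares as the second-stage learner; none of these specific choices come from the paper:

```python
import numpy as np

def two_stage_impute(M, observed, rank=2):
    """Two-stage sketch: (i) low-rank factorize the fully observed 0/1
    observation pattern to recover proxy latent features; (ii) fit a
    supervised model of the observed entries on those features and use it
    to impute every entry. The rank and the linear second stage are
    illustrative choices, not the paper's method."""
    # Stage (i): truncated SVD of the observation pattern itself.
    U, s, Vt = np.linalg.svd(observed.astype(float), full_matrices=False)
    row_feat = U[:, :rank] * s[:rank]   # latent row factors
    col_feat = Vt[:rank, :].T           # latent column factors

    # Stage (ii): supervised learning (least squares) on the recovered features.
    rows, cols = np.nonzero(observed)
    feats = lambda r, c: np.hstack([row_feat[r], col_feat[c], row_feat[r] * col_feat[c]])
    beta, *_ = np.linalg.lstsq(feats(rows, cols), M[rows, cols], rcond=None)

    # Impute the full matrix from the fitted model.
    R, C = np.meshgrid(np.arange(M.shape[0]), np.arange(M.shape[1]), indexing="ij")
    return (feats(R.ravel(), C.ravel()) @ beta).reshape(M.shape)

rng = np.random.default_rng(2)
p_row = rng.uniform(0.2, 0.9, 200)
p_col = rng.uniform(0.2, 0.9, 200)
M_true = np.outer(p_row, p_col)         # outcome shares the latent factors with the bias
mask = rng.random((200, 200)) < M_true  # entry (i, j) observed w.p. p_row[i] * p_col[j]
M_hat = two_stage_impute(M_true, mask, rank=2)
```

Because the observation probabilities and the outcomes share the same latent factors here, the features recovered from the mask alone carry enough information to predict the unobserved entries — the mechanism the paper exploits.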


On How Well Generative Adversarial Networks Learn Densities: Nonparametric and Parametric Results

Liang, Tengyuan

arXiv.org Machine Learning

We study in this paper the rate of convergence for learning distributions with the Generative Adversarial Networks (GAN) framework, which subsumes Wasserstein, Sobolev and MMD GANs as special cases. We study a wide range of parametric and nonparametric target distributions under a collection of objective evaluation metrics. On the nonparametric end, we investigate the minimax optimal rates and the fundamental difficulty of density estimation under the adversarial framework. On the parametric end, we establish theory for neural network classes that characterizes the interplay between the choice of generator and discriminator. We investigate how to improve the GAN framework with better theoretical guarantees through the lens of regularization. We discover and isolate a new notion of regularization, called the \textit{generator/discriminator pair regularization}, that sheds light on the advantage of GANs compared to classic parametric and nonparametric approaches for density estimation.
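The common integral probability metric (IPM) form behind Wasserstein, Sobolev and MMD GANs, $d_{\mathcal F}(P, Q) = \sup_{f \in \mathcal F} |E_P f - E_Q f|$, can be sketched on samples using a toy finite discriminator class; the ReLU features below are illustrative stand-ins, not the paper's neural network classes:

```python
import numpy as np

def empirical_ipm(X, Y, discriminators):
    """Empirical integral probability metric d_F(P, Q) = sup_{f in F} |E_P f - E_Q f|,
    with the supremum taken over a finite discriminator class F. Wasserstein,
    Sobolev and MMD GANs correspond to different (infinite) choices of F."""
    return max(abs(np.mean(f(X)) - np.mean(f(Y))) for f in discriminators)

rng = np.random.default_rng(3)
X = rng.normal(0.0, 1.0, (5000, 1))
Y = rng.normal(1.0, 1.0, (5000, 1))   # mean-shifted distribution
# Toy ReLU discriminators; real GAN discriminators are trained neural networks.
F = [lambda z, w=w: np.maximum(w * z[:, 0], 0.0) for w in (-1.0, 1.0)]
```

A GAN trains the generator to minimize exactly this kind of worst-case mean discrepancy, so the richness of the discriminator class governs both the metric being minimized and, as the paper analyzes, the attainable rate of convergence.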